256 research outputs found

    Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning

    Large language models (LLMs) can perform In-Context Learning (ICL) by conditioning on a few demonstrations of a new downstream task. However, this learning paradigm suffers from high instability, stemming from substantial variance induced by factors such as the input distribution of the selected examples, their ordering, and the prompt format. In this work, we demonstrate that even when all these factors are held constant, random selection of examples still results in high variance. Consequently, we explore the informative ability of data examples by quantifying the Information Gain (IG) obtained in prediction after observing a given example candidate, and we propose to sample the candidates with maximum IG. Additionally, we identify the presence of template bias, which can lead to unfair evaluations of IG during the sampling process. To mitigate this bias, we introduce a Calibration Before Sampling strategy. The experimental results show that our proposed method yields an average relative improvement of 14.3% across six classification tasks using three LLMs. Comment: Accepted to the Findings of EMNLP 202
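
The selection idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes IG is defined as H(Y) - H(Y | x), so that maximizing IG amounts to picking the candidate with the lowest prediction entropy, and it assumes calibration divides each prediction by the label distribution obtained for a content-free input. The candidate names and probabilities are hypothetical stand-ins for LLM outputs.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def calibrate(probs, content_free_probs):
    """Assumed calibration: divide out the content-free prediction, then renormalize."""
    scaled = [p / q for p, q in zip(probs, content_free_probs)]
    total = sum(scaled)
    return [s / total for s in scaled]


def select_max_ig(candidates, content_free_probs):
    """Return the candidate whose calibrated prediction has the lowest entropy.

    With H(Y) treated as fixed, maximizing IG = H(Y) - H(Y | x) is the same
    as minimizing the conditional entropy H(Y | x).
    """
    return min(
        candidates,
        key=lambda name: entropy(calibrate(candidates[name], content_free_probs)),
    )


# Hypothetical label distributions predicted after observing each candidate.
candidates = {
    "example_a": [0.50, 0.50],
    "example_b": [0.90, 0.10],
    "example_c": [0.70, 0.30],
}
# Hypothetical prediction for a content-free input (the template's own bias).
content_free = [0.70, 0.30]

best = select_max_ig(candidates, content_free)
print(best)
```

In this toy setting, example_c looks fairly confident before calibration but merely reproduces the template's bias, so after calibration it becomes the least informative candidate.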

    Dual Node and Edge Fairness-Aware Graph Partition

    Fair graph partition of social networks is a crucial step toward ensuring fair and non-discriminatory treatment in unsupervised user analysis. Current fair partition methods typically consider node balance, a notion that pursues a proportionally balanced number of nodes from all demographic groups, but ignore the bias induced by imbalanced edges in each cluster. To address this gap, we propose the notion of edge balance to measure the proportion of edges connecting different demographic groups in clusters. We analyze the relations between node balance and edge balance; then, using line graph transformations, we propose a co-embedding framework to learn dual node and edge fairness-aware representations for graph partition. We validate our framework on several social network datasets and observe balanced partitions in terms of both nodes and edges, along with good utility. Moreover, we demonstrate that our fair partition can be used as pseudo labels to help graph neural networks behave fairly in node classification and link prediction tasks.
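
To make the two notions concrete, here is a small sketch. The exact definitions are assumptions, since the abstract does not give formulas: node balance is taken as the ratio of the smallest to the largest group count in a cluster, and edge balance as the fraction of a cluster's internal edges that connect nodes from different groups. The toy graph, group labels, and cluster assignment are hypothetical.

```python
from collections import Counter


def node_balance(nodes, group):
    """Assumed definition: smallest group count divided by largest group count."""
    counts = Counter(group[n] for n in nodes)
    if len(counts) < 2:
        return 0.0  # only one group present, so the cluster is fully unbalanced
    return min(counts.values()) / max(counts.values())


def edge_balance(edges, group, cluster, target):
    """Assumed definition: share of edges inside `target` that link different groups."""
    inside = [(u, v) for u, v in edges if cluster[u] == target and cluster[v] == target]
    if not inside:
        return 0.0
    mixed = sum(1 for u, v in inside if group[u] != group[v])
    return mixed / len(inside)


# Hypothetical toy graph: six nodes, two demographic groups, two clusters.
group = {0: "a", 1: "a", 2: "b", 3: "b", 4: "a", 5: "b"}
cluster = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
edges = [(0, 1), (2, 3), (0, 2), (4, 5), (3, 4)]

cluster0_nodes = [n for n, c in cluster.items() if c == 0]
nb = node_balance(cluster0_nodes, group)
eb = edge_balance(edges, group, cluster, 0)
print(nb, eb)
```

Cluster 0 is perfectly node-balanced (two nodes per group), yet only one of its three internal edges crosses groups, which is the kind of gap the abstract describes.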

    Mining Label Distribution Drift in Unsupervised Domain Adaptation

    Unsupervised domain adaptation aims to transfer task knowledge from a labeled source domain to a related yet unlabeled target domain, and is attracting extensive interest from academia and industry. Although tremendous effort has been devoted to minimizing the domain divergence, most existing methods address only part of the picture by aligning feature representations from different domains. Beyond the discrepancy in feature space, the gap between the known source label distribution and the unknown target label distribution, recognized as label distribution drift, is another crucial factor that raises domain divergence, and it has not received enough attention or been well explored. In this paper, we first experimentally reveal how label distribution drift negatively affects current domain adaptation methods. Next, we propose the Label distribution Matching Domain Adversarial Network (LMDAN) to handle data distribution shift and label distribution drift jointly. In LMDAN, the label distribution drift problem is addressed by the proposed source sample weighting strategy, which selects samples that contribute to positive adaptation and avoids the negative effects brought by the mismatch in label distribution. Finally, unlike general domain adaptation experiments, we modify domain adaptation datasets to create considerable label distribution drift between the source and target domains. Numerical results and empirical model analysis show that LMDAN delivers superior performance compared to other state-of-the-art domain adaptation methods under such scenarios.

    Tactical Trajectory Planning for Stealth Unmanned Aerial Vehicle to Win the Radar Game

    In this paper, the problem of planning a tactical trajectory for a stealth unmanned aerial vehicle (UAV) to win the radar game is studied. Three principles of how to win the radar game are presented, and their use by a stealth UAV to evade radar tracking is analysed. The problem is formulated by integrating the model of the stealth UAV, the constraints of radar detection, and the multiple objectives of the game. A trajectory planning algorithm based on pseudospectral multi-phase optimal control is developed to solve the formulated problem. The pseudospectral method is employed to seek the optimal solution with satisfactory convergence speed. The experimental results show that the proposed method is feasible and effective. By following the planned trajectory, with several switches between exposure and stealth, the stealth UAV can win the radar game. Defence Science Journal, 2012, 62(6), pp.375-381, DOI:http://dx.doi.org/10.14429/dsj.62.268

    Knowledge Reused Outlier Detection

    Tremendous effort has been invested in unsupervised outlier detection research, which is conducted on unlabeled data sets under abnormality assumptions. With abundant related labeled data available as auxiliary information, we consider transferring the knowledge from the labeled source data to facilitate unsupervised outlier detection on the target data set. To make full use of the source knowledge, the source data and target data are put together for joint clustering and outlier detection, using the source data cluster structure as a constraint. To achieve this, the categorical utility function is employed to regularize the partitions of the target data to be consistent with the source data labels. With an augmented matrix, the problem is solved by a K-means-based method with a rigorous mathematical formulation and a theoretical convergence guarantee. We have used four real-world data sets and eight outlier detection methods of different kinds for extensive experiments and comparison. The results demonstrate the effectiveness and significant improvements of the proposed method in terms of outlier detection and cluster validity metrics. Moreover, a parameter analysis is provided as a practical guide, and a noisy source label analysis shows that the proposed method can handle real applications where source labels can be noisy.

    Affine Transformation Edited and Refined Deep Neural Network for Quantitative Susceptibility Mapping

    Deep neural networks have demonstrated great potential in solving dipole inversion for Quantitative Susceptibility Mapping (QSM). However, the performance of most existing deep learning methods degrades drastically with mismatched sequence parameters such as acquisition orientation and spatial resolution. We propose an end-to-end AFfine Transformation Edited and Refined (AFTER) deep neural network for QSM, which is robust against arbitrary acquisition orientation and spatial resolution up to 0.6 mm isotropic at the finest. The AFTER-QSM neural network starts with a forward affine transformation layer, followed by a U-Net for dipole inversion, then an inverse affine transformation layer, followed by a Residual Dense Network (RDN) for QSM refinement. Simulation and in-vivo experiments demonstrated that the proposed AFTER-QSM network architecture had excellent generalizability. It can successfully reconstruct susceptibility maps from highly oblique and anisotropic scans, leading to the best image quality assessments in simulation tests and to suppressed streaking artifacts and noise levels in in-vivo experiments compared with other methods. Furthermore, ablation studies showed that the RDN refinement network significantly reduced the image blurring and susceptibility underestimation caused by affine transformations. In addition, the AFTER-QSM network substantially shortened the reconstruction time, from minutes using conventional methods to only a few seconds.

    Characterizing the Influence of Graph Elements

    The influence function, a method from robust statistics, measures how model parameters, or functions of the model parameters, change when training instances are removed or modified. It is an efficient and useful post-hoc method for studying the interpretability of machine learning models without the need for expensive model re-training. Recently, graph convolution networks (GCNs), which operate on graph data, have attracted a great deal of attention. However, there is no preceding research on the influence functions of GCNs to shed light on the effects of removing training nodes/edges from an input graph. Since the nodes/edges in a graph are interdependent in GCNs, it is challenging to derive influence functions for GCNs. To fill this gap, we started with the simple graph convolution (SGC) model that operates on an attributed graph and formulated an influence function to approximate the changes in model parameters when a node or an edge is removed from an attributed graph. Moreover, we theoretically analyzed the error bound of the estimated influence of removing an edge. We experimentally validated the accuracy and effectiveness of our influence estimation function. In addition, we showed that the influence function of an SGC model could be used to estimate the impact of removing training nodes/edges on the test performance of the SGC without re-training the model. Finally, we demonstrated how to use influence functions to effectively guide adversarial attacks on GCNs.
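
For orientation, the standard first-order influence approximation for removing one independent training instance is written out below. This is the generic form from the influence-function literature, not the SGC-specific derivation of this paper, which must additionally handle interdependent nodes and edges. The symbols are assumptions introduced here: \hat{\theta} is the trained parameter vector, L the per-instance loss, n the number of training instances, and z the removed instance.

```latex
% Empirical Hessian of the training objective at the trained parameters
H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta})

% First-order estimate of the parameter change after removing instance z
\hat{\theta}_{-z} - \hat{\theta} \approx \frac{1}{n} \, H_{\hat{\theta}}^{-1} \, \nabla_{\theta} L(z, \hat{\theta})
```

The estimate needs only one gradient and one Hessian-inverse product at the already trained parameters, which is why no re-training is required.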

    Marginalized Latent Semantic Encoder for Zero-Shot Learning

    Zero-shot learning has been well explored to precisely identify new unobserved classes through a visual-semantic function obtained from the existing objects. However, two challenging obstacles remain: one is that human-annotated semantics are insufficient to fully describe the visual samples; the other is the domain shift between existing and new classes. In this paper, we attempt to exploit the intrinsic relationship in the semantic manifold when the given semantics are not enough to describe the visual objects, and to enhance the generalization ability of the visual-semantic function with a marginalized strategy. Specifically, we design a Marginalized Latent Semantic Encoder (MLSE), which is learned on the augmented seen visual features and the latent semantic representation. Meanwhile, latent semantics are discovered under an adaptive graph reconstruction scheme based on the provided semantics. Consequently, our proposed algorithm can enrich visual characteristics from seen classes and generalize well to unobserved classes. Experimental results on zero-shot benchmarks demonstrate that the proposed model delivers superior performance over state-of-the-art zero-shot learning approaches.